ABSTRACT

In Escherichia coli, tetracycline prevents translation.
When subject to tetracycline, E. coli express TetA to
pump it out by a mechanism that is sensitive, while
fairly independent of cellular metabolism. We con-
structed a target gene, PtetA-mRFP1-96BS, with a 96
MS2-GFP binding site array in a single-copy BAC 
vector, whose expression is controlled by the tetA 
promoter. We measured the in vivo kinetics of pro- 
duction of individual RNA molecules of the target 
gene as a function of inducer concentration and 
temperature. From the distributions of intervals 
between transcription events, we find that RNA pro- 
duction by PtetA is a sub-Poissonian process. Next, 
we infer the number and duration of the prominent 
sequential steps in transcription initiation by max- 
imum likelihood estimation. Under full induction and 
at optimal temperature, we observe three major 
steps. We find that the kinetics of RNA production 
under the control of PtetA, including number and dur- 
ation of the steps, varies with induction strength and 
temperature. The results are supported by a set of 
logical pairwise Kolmogorov-Smirnov tests. We 
conclude that the expression of TetA is controlled 
by a sequential mechanism that is robust, yet
sensitive to external signals.

INTRODUCTION 

Tetracycline is a polycyclic naphthacene carboxamide (1), 
isolated from the Streptomyces genus of Actinobacteria (2),
which prevents bacterial cell growth by binding to the
30S ribosomal subunit. This binding interferes with the
attachment of aminoacyl-tRNA to the mRNA-ribosome 
translation complex, inhibiting protein synthesis (3,4). 



In Escherichia coli, resistance to tetracycline is conferred
by the extra-chromosomal Tn10 transposon encoded class
B molecular determinants, forming the tet operon (5-8)
(also named divergon (9) or regulon (10)). This operon
consists of two structural genes, tetA and tetR. tetA is
essential for tetracycline resistance (11), as it encodes
a membrane-targeted antiporter protein, TetA, respon-
sible for active efflux of tetracycline, whereas tetR codes
for TetR that regulates the tet operon.

In the absence of tetracycline, TetR binds to the
operator sites of tetA and tetR, preventing their transcrip-
tion (12). In the presence of tetracycline, TetR binds as a
dimer to the biologically active tetracycline-Mg2+
complex, causing an allosteric conformational change in 
the repressor protein (13). This releases the repressor from 
the DNA, allowing RNA polymerase to bind and initiate 
transcription of tetA and tetR. The tet operon is thus a 
self-repressing system (12,14), capable of fast and efficient 
response to tetracycline. Its expression activity is largely 
independent of the metabolic state of the host cell, making 
it a preferential system to control recombinant gene 
expression (15) and to study mechanisms of gene expres-
sion (13). Studies of the β-galactosidase activity of the
Tn10 tetR-tetA promoter have shown that PtetA is the
strongest of the three promoters of this operon (16). It
has thus been isolated, modified, and used (namely the
tetR/O region) in several studies of gene expression
(4), and its derivatives have been used in several synthetic
circuits (17-23). 

Studies suggest that gene expression kinetics in prokary-
otes is mainly controlled at the transcriptional
level, particularly in initiation (24). In vitro measurements
suggest that, in E. coli, transcription initiation is a
sequential process (25). The first step, named 'closed
complex formation', is the binding of the RNA polymer-
ase enzyme to the promoter region and finding of the
transcription start site (TSS). This is followed by
isomerization, DNA unwinding and loading of the
nucleotide strand (the 'open complex formation') (25).
Afterwards, elongation of the RNA strand takes place 
(26), and once the termination site is reached, the 
single-stranded RNA molecule and the RNA polymerase 
are released (27). This sequential process (28) is subject to 
tight regulation that takes place at one or more stages. 

Evidence suggests that, for most promoters in E. coli, 
the open complex formation is the main rate-determining 
step in initiation (29,30). However, the durations of other 
steps, namely the closed complex formation, isomeri-
zation, and promoter clearance are also sequence-
dependent, differing widely between promoters and in 
different conditions (31). Environmental factors can also 
affect this kinetics (32,33). For example, apart from induc- 
tion (34), factors such as temperature and pH can also 
affect the "rate-limiting" steps in initiation (30,35). 

Changes in the kinetics of these steps allow the overall
expression rate of a gene to be changed (8,25). For
example, in vivo measurements of the expression of the
synthetic promoter PLtetO-1 using the luciferase reporter
system have shown that the rate of RNA production can
change by 5000-fold with induction by anhydrote-
tracycline (aTc) (8). However, it is noted that this
promoter was engineered with the aim of allowing tight
regulation (wide range of induction) (8). Among the
several tetracycline analogs, aTc was found to be an
effective inducer even at very low concentrations
(<50 ng/ml), as it binds to the TetR protein with very
high affinity (~35-fold higher than tetracycline) (36-38)
and is less toxic to cells, with a minimal inhibitory concen-
tration of 4 µg/ml (39).

If transcription initiation is the rate-determining step in 
RNA production, and if it is composed of several sequen- 
tial steps (30-32,40,41), provided that these are approxi- 
mately exponentially distributed in duration, one would 
expect that transcript production is a sub-Poissonian 
process (40). However, recent measurements of in vivo 
cell-to-cell diversity in RNA numbers have exhibited 
higher diversity in RNA numbers than would be 
expected from a sub-Poissonian process of transcript pro-
duction (42-44). In (44), it was hypothesized that this was
because of either super-Poissonian RNA production or 
non-Poissonian RNA degradation. In (42,43), it was 
hypothesized that the cause is the existence of periods of 
activity and inactivity of the promoter. It is noted that 
measurements of cell-to-cell diversity in either RNA or 
protein numbers are affected by several processes other 
than transcription and translation. Specifically, the 
kinetics of degradation of RNA and proteins can be com- 
plex (45), that is, non-Poissonian, and there may be 
non-negligible differences in measurements using in vivo 
and in vitro techniques (24). Finally, another event that 
affects cell-to-cell diversity in RNA numbers is cell 
division, particularly because the intracellular environ- 
ment of E. coli is not well-stirred (46). Even if the RNA 
molecules are partitioned in an unbiased fashion, the sto- 
chastic nature of this process will enhance the cell-to-cell 
diversity in the numbers of these molecules following 
division events (47,48). In addition, bacteria are known 
to partition unwanted protein aggregates in a biased 
fashion (49,50), including, for example, RNA tagged
with MS2 coat protein fused with Green Fluorescent
Protein (MS2-GFP) (51,52), and fluorescent proteins such
as Tsr-Venus (53-55). This bias in partitioning ought to
exacerbate cell-to-cell diversity in RNA and protein 
numbers when assessed by these methods. The degree of 
stochasticity in transcription thus needs to be assessed 
without the interference of subsequent events, by 
measuring transcript production one event at a time (40). 
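
The sub-Poissonian expectation stated above can be illustrated numerically. The following is a minimal sketch, not part of the measurements: it assumes hypothetical step durations (the values 900 s and 300 s are illustrative only) and compares the analytic squared coefficient of variation of a sum of independent exponential steps, the sum of squared step means over the squared total mean, with the value estimated from simulated intervals. A single step gives a value of 1 (Poissonian), whereas several steps give a value below 1.

```python
import random


def analytic_cv2(step_means):
    """CV^2 of a sum of independent exponential steps with the given means."""
    total = sum(step_means)
    # Each exponential step contributes a variance equal to its squared mean.
    return sum(m * m for m in step_means) / (total * total)


def simulate_intervals(step_means, n, rng):
    """Draw n intervals, each the sum of one exponential draw per step."""
    return [sum(rng.expovariate(1.0 / m) for m in step_means) for _ in range(n)]


def sample_cv2(values):
    """Sample variance divided by the squared sample mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return variance / (mean * mean)


rng = random.Random(1)
# Hypothetical step durations in seconds (illustrative, not measured values).
models = {"one step": [900.0], "three steps": [300.0, 300.0, 300.0]}
for name, steps in models.items():
    intervals = simulate_intervals(steps, 5000, rng)
    print(name, round(analytic_cv2(steps), 3), round(sample_cv2(intervals), 3))
```

The more evenly the total duration is divided among the steps, the lower the resulting value, which is why the number and relative durations of the steps shape the noise in transcript production.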

Here, using in vivo, single-molecule based techniques, 
we characterize the kinetics of initiation of PtetA (15,56),
including its stochasticity and how it changes with induc- 
tion strength and temperature. Namely, we assess the tran- 
script production dynamics at the single event level. This is 
possible by a method recently developed to tag RNA 
in vivo in E. coli with MS2-GFP proteins, which allows 
individual transcription events to be detectable shortly 
after production (55,57), and the behaviour is similar to 
that of the unlabelled system (57). 

We report measurements of time intervals between con-
secutive productions of RNA molecules under the control
of PtetA when subject to several induction strengths and
temperatures. From the distribution of these intervals, we
analyse the dynamics of this promoter. We address the
following questions. What is the in vivo kinetics of RNA
production, one event at a time, under the control of PtetA,
when fully induced? How does the in vivo kinetics of tran-
scription change with induction? How noisy is this
process? Finally, from the inference of number and dur-
ations of the rate-limiting steps, we address how the
kinetics of the rate-limiting steps changes with tempera-
ture. In the end, we compare the kinetics of initiation of
PtetA with that of the Plac/ara-1 promoter, also named Plar,
which has been recently characterized (40) using the same
methods.



MATERIALS AND METHODS 

Chemicals 

For routine cultures, the components of Luria-Bertani 
(LB) broth (Tryptone, Yeast extract and NaCl) were 
purchased from LabM (UK) and antibiotics from 
Sigma-Aldrich (USA). Phusion high-fidelity polymerase 
and other PCR reagents are from Finnzymes (Finland). 
Fermentas kits (Finland) for plasmid isolation and PCR
product extraction and purification were used as per the
instructions provided. To perform qPCR, cells were fixed
with RNAprotect bacteria reagent (Qiagen, USA). The
Tris and EDTA for lysis buffer were purchased from
Sigma-Aldrich (USA) and lysozyme from Fermentas
(USA). The total RNA extraction was done with
RNeasy RNA purification kit (Qiagen, USA). DNase I,
RNase-free, for RNA purification was purchased from
Promega (USA). iScript Reverse Transcription Supermix
for cDNA synthesis and iQ SYBR Green supermix for
qPCR were purchased from Biorad (USA). Agarose for
microscopic slide gel preparation and electrophoresis
and isopropyl β-D-1-thiogalactopyranoside (IPTG) and
aTc for induction of cells are from Sigma-Aldrich
(USA). For staining DNA and RNA on gels, SYBR
Safe from Invitrogen (USA) was used.

Bacterial strain and growth conditions

The strain E. coli DH5α-PRO (identical to DH5α-Z1) (57)
was used to clone and express the target and reporter
genes. For overnight cultures, the strain from glycerol
stock was inoculated in LB broth (10 g of tryptone, 5 g of
yeast extract and 5 g of NaCl per litre, pH 7.0) (56) with
appropriate antibiotics (100 µg/ml ampicillin and 35 µg/ml
chloramphenicol) and incubated at 37°C with shaking
(250 rpm). 

Genetic constructs 

We constructed the target gene PtetA-mRFP1-96BS with a
96 MS2-GFP binding site array in a single-copy BAC
vector by restricting out the Plar promoter with BamHI
restriction endonuclease from a BAC clone carrying a
target gene Plar-mRFP1-96BS (42) (a kind gift from Ido
Golding, University of Illinois, IL), and replacing it with
PtetA amplified from the pTetLux1 plasmid (56). The
primers (Forward: 5' GGGATCCCTCACATGACCCGACAC 3'
and Reverse: 5' GGGATCCACTGCAATCGCGATAGC 3')
were designed to amplify the PtetA promoter
with BamHI restriction site flanking regions. The
amplicon and the BAC vector were subjected to BamHI
restriction digestion, followed by ligation of the
amplified product into the BAC vector. Thus, we
obtained a single-copy F-based plasmid carrying the
target region PtetA-mRFP1-96BS. This product was
transferred into competent E. coli strain DH5α-PRO
host cells. The recombinants were selected with antibiotic
screening and further confirmed with sequence analysis.
The reporter molecules to visualize the target RNA were
expressed from the pZS12MS2-GFP plasmid (55) (SC101
origin, 6-8 copies per cell, AmpR, PLlacO-1 promoter)
cloned into the host strain, a kind gift from Philippe
Cluzel, University of Chicago, IL. The tetR gene, which
encodes the regulatory protein TetR, is integrated into
the chromosome of E. coli strain DH5α-PRO, under the
control of a strong promoter, PN25, which ensures appropri-
ate levels of repressor proteins for tight regulation and full
induction, in spite of the residual binding affinity of
the TetR-aTc complex to DNA (8,55).

A detailed map of the tetA promoter sequence with the
crucial elements, such as the TetR binding site, the -10/-35
regions, the TSS and the ribosome binding site region,
as well as the beginning of the MS2-GFP binding region, is
shown in Figure 1. Note that there is no tetA gene in the
genome of E. coli DH5α-PRO or in the genetic constructs
that were transformed into the strain.

Induction of expression of the target gene and of the 
reporter gene 

From the overnight culture, cells were inoculated into a 
fresh LB medium supplemented with antibiotics, with 
initial OD of 0.1 at 600 nm and incubated at a specific 
temperature (24°C or 37°C) to mid-logarithmic phase 
with 0.5 OD. To induce the production of MS2-GFP 
proteins, IPTG (1 mM) was added in the medium at 0.35
OD. The target mRNA from PtetA-mRFP1-96BS was then
induced by adding aTc to the liquid culture. The target
mRNA is rapidly tagged by the MS2-GFP proteins in the
cytoplasm and can be detected as fluorescent spots soon 
after transcription occurs (57). 

Quantitative PCR for mean mRNA quantification 

The quantification of changes in the mean transcript pro-
duction rate of the target gene with induction strength and
temperature, relative to a reference gene, was validated
with qPCR. For the experimental samples, 10 ml of
cells with 0.5 OD at 600 nm were induced with aTc
(5-25 ng/ml) alone for 1 hour in liquid culture at a
specific temperature (24°C or 37°C). Cells were then im-
mediately fixed with RNAprotect bacteria reagent
followed by enzymatic lysis with Tris-EDTA lysozyme
buffer (pH 8.3). From the lysed cells, total RNA was
isolated with RNeasy RNA purification kit. The total
RNA was separated by electrophoresis through a 1%
agarose gel and stained with SYBR® Safe DNA Gel
Stain. The RNA was found intact with discrete bands
for 16S and 23S ribosomal RNAs. To remove DNA con-
tamination, RNA samples were treated with DNase I,
RNase-free enzyme as per manufacturer's instructions.
The A260nm/A280nm ratios for the RNA samples,
assessed using a GeneQuant pro UV/Vis Spectrophotometer
(80-2114-98), were 2.0-2.1, indicating highly
purified RNA, and the yield was estimated to be
0.4-0.5 µg/µl. cDNA was synthesized from 1 ng of RNA
with iScript Reverse Transcription Supermix according
to manufacturer's instructions and stored at -20°C. The
qPCR master mix contained iQ SYBR Green supermix
with primers for the target and reference genes at a final
concentration of 200 nM. We used three reference genes
(16S rRNA (42,58), 23S rRNA (59) and dxs (60)) for
internal reference, and similar patterns were observed in
all cases. In the results section, we show the data relative
to the 16S rRNA reference gene, whereas in the supple-
ment, we show the data relative to dxs (Supplementary
Figure S2).

The primers for the target mRNA were (Forward:
5' TACGACGCCGAGGTCAAG 3' and Reverse:
5' TTGTGGGAGGTGATGTCCA 3') to the region of mRFP1
(GenBank Accession Number: AF506027) (61) with
amplicon length 90 bp and for the reference gene 16S
rRNA (EcoCyc Accession Number: EG30090)
(Forward: 5' CGTCAGCTCGTGTTGTGAA 3' and
Reverse: 5' GGACCGCTGGCAACAAAG 3'), with
amplicon length 74 bp (40), and primers were obtained
from Thermo Scientific. The template for each reaction
was 20 ng of cDNA. PCR efficiencies were similar for
the target and reference genes, both greater than
95%. The thermal cycling protocol used was: 94°C for
15 s, 54°C for 30 s, and 72°C for 30 s up to 36 cycles,
and in the end, one cycle of 94°C for 15 s. The fluorescence
was read at the end of each cycle. These reactions were
performed in three experiments, each with three replicates
per condition with a final reaction volume of 50 µl. No-RT
controls and no-template controls were used to crosscheck
non-specific signals and contamination. The reaction was
carried out in low-profile tube strips in a MiniOpticon
Real time PCR system (Biorad). The Cq values generated

A PtetA-mRFP1-96bs sequence in BAC

tetR binding site tetR binding site mRFP1 96bs 

5' CCATCGAATGGCCAGATGATTAATTCCTAATTTTTGTTGACACTCTATCATTGATAGAGTTATTTTACCACTCCCTATCAGTGATAGAGAAAAGTGAA ATG TAGCTGCG 3' 

-35 -10 +1 +104 +778 

B PN25-tetR sequence in E.coli genome 

tetR 

5' att t0 TCATAAAAAATTTATTTGCTTTCAGGAAAATTTTTCTGTATAATAGATTCA_ ATGTCTAGAT GAAAGTGGGTCTTA T1 att 3' 

-35 -10 +1 +97 +623 

Figure 1. (A) PtetA-mRFP1-96bs-BAC plasmid map: nucleotide sequence of the tet regulatory region on the BAC plasmid, beginning with PtetA (-35
and -10 consensus sequences), TSS (+1), palindromic patterns corresponding to two TetR binding sites (blue), mRFP1 start site at +104, and 96bs
start site at +778. (B) PN25-tetR sequence integrated in the E. coli genome: PN25-controlled TetR protein coding gene and the lambda attP site.
Terminators t0 and T1 prevent transcription from the integrated promoters into the neighbouring regions of the E. coli genome.



by CFX Manager™ Software were imported into
Microsoft Excel, and the data were analysed following 
the Livak method (62) to obtain the fold changes in the 
target gene, normalized to the reference gene, and to cal- 
culate the standard error between experiments. 
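
The fold-change calculation of the Livak method can be sketched as follows. This is a minimal illustration, assuming amplification efficiencies close to 100% (as the method requires); the function names and the Cq values in the usage lines are hypothetical and are not data from this study.

```python
import math
import statistics


def livak_fold_change(cq_target_treated, cq_ref_treated,
                      cq_target_control, cq_ref_control):
    """Fold change of the target gene by the 2^(-ddCq) (Livak) method."""
    # Normalize the target to the reference gene within each condition.
    delta_treated = cq_target_treated - cq_ref_treated
    delta_control = cq_target_control - cq_ref_control
    # Compare the treated condition with the control condition.
    return 2.0 ** -(delta_treated - delta_control)


def standard_error(values):
    """Standard error of the mean across independent experiments."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical Cq values for three experiments (illustrative only).
experiments = [(20.0, 10.0, 23.0, 10.0),
               (20.5, 10.2, 23.1, 10.1),
               (19.8, 9.9, 23.2, 10.0)]
folds = [livak_fold_change(*cq) for cq in experiments]
print(folds, standard_error(folds))
```

A target Cq that is three cycles lower after induction, at unchanged reference Cq, corresponds to an eightfold increase in transcript level.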

Fluorescent microplate reader measurement for mean 
protein levels 

The mean fluorescence of the mRFP1 protein under the
control of PtetA was measured with a Thermo Scientific
Fluoroskan Ascent Microplate Fluorometer. Cells at 0.5
OD600nm were induced with 15 ng/ml of aTc and incubated
at a specific temperature (24°C or 37°C) for 1 hour with
shaking. The optical density of induced and non-induced
cells was measured after 1 hour. From that, 0.5 OD600nm
of cells were taken, centrifuged and then re-suspended
in fresh medium at a 1:200 dilution. From this,
150 µl of cells were taken, placed on a 96-well microplate
and measured for relative fluorescence levels of mRFP1
protein with excitation (584 nm) and emission (607 nm) 
wavelengths (61). The cell density was kept identical in 
all wells of the plate for all measured conditions. We per- 
formed three independent experiments with three repli- 
cates for each condition. 

Time-lapse microscopy 

Cells were induced with IPTG and aTc as described earlier 
in the text. Five minutes after induction in liquid culture 
by aTc, cells were placed on a microscope slide between a 
coverslip and 1% LB-agarose gel with IPTG (1 mM) and
aTc (15 ng/ml), to maintain full induction under the
microscope. Cells were visualized in a Nikon Eclipse
(TE2000-U, Nikon, Japan) inverted C1 confocal laser-
scanning system with a 100x Apo TIRF (1.49 NA, oil)
objective. The slide was kept in a temperature-controlled 
chamber, and the cells were focused within a few seconds 
under light microscope. Images were collected once per 
minute up to 1 hour under the fluorescence confocal 
microscope. Image acquisition began approximately 
20 minutes after induction (including the 5 minutes induc- 
tion in liquid culture). This interval is sufficient to reach a 
steady state level of induction (55,63). For image acquisi-
tion, we used Nikon software EZ-C1, under dark condi-
tions to minimize photolysis of aTc (photobleaching and
nutrient depletion only become significant after 2 or more
hours). GFP fluorescence was measured using a 488 nm
argon ion laser (Melles-Griot) and a 515/30 nm detection
filter. Images were acquired using medium pinhole, gain
130 and 1.68 µs pixel dwell. On the slide, the division time
of the cells was approximately 40 minutes, likely because
of the imaging.

We used a recent interacting multiple model filter based 
autofocus strategy (64). The method relies on the nature of 
the focal drift and exploits the interacting multiple model 
filter algorithm to predict the focal drift at time t based on 
the measurement at time t-1. It allows a drastic reduction 
of the number of required z-slices for focal drift correc-
tion, thus minimizing photobleaching.

Cells and spots segmentation and the intensity jump- 
detection method 

Detection of cells from the images is performed by a
semi-automatic method (40). It consists of manually
masking the regions that cells occupy during the
imaging. For each image, the locations, dimensions and
orientations of the cells within their masks were estimated
by principal component analysis, assuming that the fluor-
escence inside the cell is uniformly distributed. This as-
sumption is supported by measurements of the pixel
intensities inside each cell, which were found to be fairly
uniform (40). The segmentation of fluorescent spots
(tagged RNAs) was performed by a kernel density estima-
tion method (65) with a Gaussian kernel. An example of
the results of the segmentation is shown in Supplementary
Figure S1.

From these data, we compute the cell-background-
subtracted total spot intensity time traces for each cell.
Because the tagged RNAs do not degrade during the
experiment (shown in the 'Results' section), this intensity
should follow a monotonically increasing piecewise-
constant function, where the jumps correspond to the ap-
pearance of novel mRNAs. This was verified by inspection
(40). We fit such a function to the time trace by least
squares, where the number of pieces in the function is
determined by an F-test with a P value of 0.01, thus
requiring higher order curves to fit significantly better to
justify their use. Some intermediate results of this proced-
ure, along with raw data, are shown in Figure 2. In
Supplementary Data, we provide movies of two cells
showing the time (in seconds) when each frame was
obtained following induction of the target gene.
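
The jump-detection idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used here: the optimal least-squares piecewise-constant fit is found by dynamic programming, monotonicity of the levels is not enforced, each added piece is counted as two extra parameters, and the F statistic is compared with a fixed critical value `f_crit` supplied by the caller (hypothetical; in practice it would be taken from the F distribution at P = 0.01 for the relevant degrees of freedom). The trace in the usage lines is synthetic.

```python
def best_fit(y, k):
    """Optimal k-piece constant fit of y; returns (sse, jump_indices)."""
    n = len(y)
    ps, ps2 = [0.0], [0.0]  # prefix sums of y and y^2
    for v in y:
        ps.append(ps[-1] + v)
        ps2.append(ps2[-1] + v * v)

    def cost(i, j):
        """Sum of squared errors of y[i:j] around its own mean."""
        s = ps[j] - ps[i]
        return (ps2[j] - ps2[i]) - s * s / (j - i)

    inf = float("inf")
    # table[s][j]: lowest SSE for y[:j] using s pieces; back stores the split.
    table = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    table[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                c = table[s - 1][i] + cost(i, j)
                if c < table[s][j]:
                    table[s][j], back[s][j] = c, i
    starts, j = [], n
    for s in range(k, 0, -1):
        j = back[s][j]
        starts.append(j)
    return table[k][n], sorted(starts)[1:]  # drop the leading 0


def detect_jumps(y, f_crit, max_pieces=6):
    """Add pieces while the F statistic of the extra piece exceeds f_crit."""
    n, k = len(y), 1
    sse, jumps = best_fit(y, 1)
    while k < max_pieces:
        sse_next, jumps_next = best_fit(y, k + 1)
        dof = n - (2 * (k + 1) - 1)  # k+1 levels and k jump positions
        if dof <= 0 or sse_next <= 1e-12:
            if sse > 1e-12 and dof > 0:
                jumps = jumps_next  # the larger model fits exactly
            break
        f_stat = ((sse - sse_next) / 2.0) / (sse_next / dof)
        if f_stat <= f_crit:
            break
        k, sse, jumps = k + 1, sse_next, jumps_next
    return jumps


# Synthetic trace: levels 0, 1, 2 (10 frames each) with small alternating noise.
trace = [i // 10 + 0.05 * (-1) ** i for i in range(30)]
print(detect_jumps(trace, f_crit=10.0))
```

Each returned index marks the first frame of a new intensity level, which in this setting would be interpreted as the appearance of a new tagged RNA.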

Inference of the number and duration of the sequential 
steps in transcription 

From the distribution of intervals between productions of 
consecutive RNA molecules, we infer by maximum- 
likelihood the number and duration of the sequential 
steps in transcription initiation (40). We assume that the 
duration of each of the sequential steps follows an expo- 
nential distribution (40). Although the steps in initiation, 
such as the open complex formation, are likely not elem-
entary (25), it was possible by this method to fit well the
measured distributions in the case of Plar using a small
number of steps that is consistent with the number of 
steps believed to be rate-limiting from in vitro studies 
(31,32). To support the results of the inference, we 
perform Kolmogorov-Smirnov tests to compare the 
measured distribution with the inferred one that best 
fits the data. This test is used to determine whether the
inferred distribution fails to fit the measured
distribution.
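
A one-sample Kolmogorov-Smirnov comparison of this kind can be sketched as follows. The sketch assumes the asymptotic Kolmogorov series with a commonly used small-sample correction for the P value, which may differ from the exact procedure used here; the model with three equal steps and all numerical values are hypothetical illustrations.

```python
import math


def ks_statistic(samples, model_cdf):
    """Largest gap between the empirical CDF and a model CDF."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = model_cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x.
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


def ks_pvalue(statistic, n, terms=100):
    """Approximate P value from the asymptotic Kolmogorov distribution."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * statistic
    if lam < 0.3:
        return 1.0  # the series converges slowly here; the value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def three_equal_steps_cdf(t, step_mean):
    """CDF of the sum of three exponential steps with equal means."""
    x = t / step_mean
    return 1.0 - math.exp(-x) * (1.0 + x + x * x / 2.0)


# Hypothetical intervals in seconds, compared with a hypothetical model.
intervals = [420.0, 610.0, 780.0, 900.0, 1050.0, 1300.0, 1700.0]
d = ks_statistic(intervals, lambda t: three_equal_steps_cdf(t, 300.0))
print(d, ks_pvalue(d, len(intervals)))
```

A small P value indicates that the inferred distribution can be rejected as a description of the measured intervals; a large one means only that the test finds no evidence against it.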

The inference procedure assumes that the measured 
intervals between the productions of consecutive RNA 
molecules are not significantly affected by elongation. 
This relies on the fact that the mean duration of the inter- 
vals between consecutive transcription events is of the 
order of 600 s or higher, depending on induction 
strength and temperature (see Results section and 
in vitro studies (8)). On the other hand, elongation of the 
target gene was measured to take only tens of seconds 
(57). Possible sequence-dependent events, such as long 
transcriptional pauses, can also be ruled out as affecting 
significantly the measured distributions because, if 
existing, they would last only 10-100 s (e.g. 32 s half-life 
for the ops pause and 52 s for the his pause) (66). In 
addition, target RNA molecules become visible even 
while elongating (57), further diminishing possible effects 
of events in elongation in the measured distributions. In 




[Figure 2, panel (B): x-axis, Time (s), 0 to 7000]

Figure 2. Tagged RNAs in E. coli cells. (A) Unprocessed frames and
segmented cells and RNA spots. The moments when images were taken
are shown for each frame. (B) Examples of time series of scaled spot
intensity levels from individual cells (circles) and the corresponding
estimated RNA numbers (solid lines). The cell shown in (A) does not
correspond to any of the traces in (B).



any case, although the elongation process may increase the 
variance of the distributions, it would not affect their 
means. Finally, the eventuality of possible premature ter- 
minations of transcription events can be ruled out in this 
case because they would generate distributions of intervals 
between transcription events with multiple peaks, centered 
on multiples of the mean interval between productions, 
which were not observed (see 'Results' section). 

Here, we assume a d-steps model such that the dur- 
ations of the d steps are exponentially distributed and 
are independent, with possibly different rates. The fit for 
each d-steps model is obtained by maximum likelihood
estimation. The likelihoods are compared using a likelihood
ratio test, and the smallest d is selected for which the model
cannot be rejected at the significance level 0.01 in favour
of a higher order model. This method was found to
reliably distinguish the number of steps and the duration 
between any two steps when they differ by ~25% or more 
in duration, from 200 intervals sampled from a stochastic 
model of gene expression with d exponentially distributed 
steps (40). Note that this method does not allow us to 
determine the temporal order of the sequential steps 
inferred. Only their number and durations can be assessed. 
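
The model-selection logic can be illustrated for the simplest comparison, one step against two steps. This is a minimal sketch under stated assumptions, not the procedure used to obtain the results: the two-step fit uses a coarse grid search instead of a numerical optimizer, the likelihood ratio statistic is referred to a chi-square distribution with one degree of freedom (an approximation, because the one-step model lies on the boundary of the two-step model), and the simulated intervals use hypothetical step durations of 300 s.

```python
import math
import random


def fit_one_step(intervals):
    """Closed-form MLE for a single exponential step: (loglik, mean)."""
    n = len(intervals)
    tau = sum(intervals) / n
    return -n * (math.log(tau) + 1.0), tau


def loglik_two_steps(intervals, tau_a, tau_b):
    """Log-likelihood of intervals as the sum of two exponential steps."""
    lo, hi = sorted((tau_a, tau_b))
    total = 0.0
    for t in intervals:
        if hi - lo < 1e-9 * hi:
            # Equal durations: density t * exp(-t / tau) / tau^2.
            total += math.log(t) - t / hi - 2.0 * math.log(hi)
        else:
            # Density (exp(-t / hi) - exp(-t / lo)) / (hi - lo), in log form.
            x = t * (1.0 / lo - 1.0 / hi)
            total += -t / hi + math.log1p(-math.exp(-x)) - math.log(hi - lo)
    return total


def fit_two_steps(intervals, grid_points=40):
    """Coarse grid-search MLE: (loglik, duration_1, duration_2)."""
    mean = sum(intervals) / len(intervals)
    fractions = [(i + 1) / grid_points for i in range(grid_points)]
    best = (-math.inf, None, None)
    for i, a in enumerate(fractions):
        for b in fractions[i:]:
            ll = loglik_two_steps(intervals, a * mean, b * mean)
            if ll > best[0]:
                best = (ll, a * mean, b * mean)
    return best


def select_steps(intervals, alpha=0.01):
    """Keep one step unless it is rejected in favour of two steps."""
    ll1, _ = fit_one_step(intervals)
    ll2, _, _ = fit_two_steps(intervals)
    stat = max(0.0, 2.0 * (ll2 - ll1))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 d.o.f.
    return (2 if p_value < alpha else 1), p_value


rng = random.Random(0)
# 200 simulated intervals from two hypothetical 300 s steps.
simulated = [rng.expovariate(1 / 300.0) + rng.expovariate(1 / 300.0)
             for _ in range(200)]
print(select_steps(simulated))
```

Extending the comparison to three or more steps follows the same pattern, with the higher order model fitted and tested against the current one until no further rejection occurs.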



RESULTS 

We study the in vivo kinetics of production of individual 
RNA molecules, as a function of inducer concentration 
and temperature, under the control of PtetA. First, we
measured the relative mean RNA levels by qPCR as a 
function of induction and temperature. In Figure 3, we 
show the results using the 16 S rRNA gene as reference, 
whereas in supplement, we show the results using dxs as 
reference (Supplementary Figure S2), which are similar to 
16 S rRNA. Previous studies suggest that the maximum 
induction is achieved with a concentration of aTc of 
20 ng/ml or higher (8,56). The results (Figure 3 and Sup-
plementary Figure S2) are in agreement, indicating that
there is no significant increase in the rate of RNA produc-
tion beyond 15 ng/ml of aTc. From these figures, we find
that both the temperature and the inducer concentration
significantly affect the rate of transcription under the
control of PtetA.

Figure 3. Relative expression level of target mRNA induced with dif-
ferent concentrations of aTc (ng/ml) at 24°C and 37°C, quantified by
qPCR using the 16S rRNA gene as reference. The standard deviation
bars are from three independent experiments. In some cases, the bars
are too small to be visible.

From here onwards, we focus on three conditions. 
Specifically, we measure gene expression in the absence 
of aTc at 37°C, and with 15 ng/ml aTc at both 37°C and
24°C so as to study how the kinetics changes with induc- 
tion and temperature. First, we verify whether, for these 
conditions, the relative protein expression levels follow 
those of the RNA. Results of the measurements of 
relative fluorescence levels by microplate fluorometer are 
shown in Figure 4, and confirm that the protein levels 
follow the RNA levels. They also show similarity (same 
order of magnitude) to the measurements reported in (56) 
for this promoter using a luminescent reporter system, 
even though the cells in (56) were in late log-phase and 
had a 90min induction period. 

We next study the kinetics of RNA production in live, 
individual cells using the MS2-GFP tagging method 
(40,42). As described in the Methods section, the expres- 
sion of the target gene is controlled by PtetA and is induced 
by aTc. The sequence of the target gene contains 96 
binding sites for the MS2 coat protein. Because of these, 
the reporter proteins (MS2-GFP) can bind to the target 
RNA, producing a fluorescent spot that is detectable from 
fluorescence microscopy images. 

Using this system, we first measured the cell-to-cell di- 
versity in the number of tagged RNA molecules produced 
by individual cells over a certain period of time in all three 
conditions. In both the induced and non-induced cases, 
cells are placed under the confocal microscope 2 h follow-
ing induction by IPTG. In the induced cases, the induction
by aTc is done 1 h following the induction by IPTG. 

From the images, we extracted the number of target 
RNA molecules in each cell, and calculated the 
mean and standard deviation of the number of RNA 
molecules in individual cells (Figure 5). We observe 
that the measured mean is in agreement with the measure- 
ments by qPCR (Figure 3), though the in vivo measure- 
ments have a slightly smaller relative increase with
induction. 



We also extracted the fraction of cells with a given 
number of RNA molecules (Figure 6). In the non-induced 
case (0 ng/ml aTc, 37°C), the variance of the distribution is
0.62, and the mean is 1.0. In the induced case (15 ng/ml
aTc, 37°C), the variance
is 2.5, and the mean is 3.6. Finally, for cells induced with
15 ng/ml aTc and incubated at 24°C, the variance of the
distribution is 1.2, and the mean is 2.2. These values are of
significance in that, in all cases, the variance is smaller 
than the mean (i.e. the Fano factor of the distribution is 
smaller than 1). If the process of RNA production was 
Poissonian, that is, if the intervals between consecutive 
productions were independent and followed an exponen- 
tial distribution, the variance ought to be equal to the 
mean (i.e. Fano factor equal to 1). Because the tagged 
RNA molecules do not degrade, this can only be explained 
either by the dynamics of RNA production or of RNA 
partitioning in division (or the combination of these 
processes). 
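
The Fano factors implied by the means and variances quoted above can be checked directly. The short sketch below uses only the values reported in this paragraph; the dictionary labels and the function name are illustrative.

```python
def fano_factor(mean, variance):
    """Variance divided by the mean; equal to 1 for a Poisson distribution."""
    return variance / mean


# (mean, variance) of RNA numbers per cell, as reported in the text.
conditions = {
    "0 ng/ml aTc, 37 C": (1.0, 0.62),
    "15 ng/ml aTc, 37 C": (3.6, 2.5),
    "15 ng/ml aTc, 24 C": (2.2, 1.2),
}
for label, (mean, variance) in conditions.items():
    print(label, round(fano_factor(mean, variance), 2))
```

All three ratios fall below 1, which is the quantitative content of the statement that the variance is smaller than the mean in every condition.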

To investigate this, we obtained distributions of inter- 
vals between productions of consecutive RNA molecules 
in individual cells in the three conditions (Figure 7). These
distributions are not affected by the dynamics of RNA 
partitioning in division because we only count intervals 
between consecutive RNAs in single cells. In each case, 
several independent measurements were made, and the 
results were combined. 

The number of cells observed in each condition is shown 
in Table 1 along with the number of intervals detected, as 
well as the square of the coefficient of variation (CV²,
defined as the variance over the mean squared) of the
intervals. It is noted that the mean duration of the inter- 
vals for the aTc 15 ng/ml (37°C) condition is in close 
agreement with in vitro measurements of the time it 
takes for a transcription initiation event to be completed 
once initiated (35). 

From Figure 7, in all cases, the shapes of the distribu- 
tions of intervals are not exponential-like. This implies 
that the process of RNA production under the control 
of PtetA is not Poissonian. Instead, because the standard 




[Figure 4, x-axis conditions: 0 ng/ml, 37°C; 15 ng/ml, 24°C; 15 ng/ml, 37°C]

Figure 4. Relative mean expression level of target proteins (mRFP1)
estimated by microplate fluorometer in three conditions. The error bars
are the standard deviation from the measurements in the different wells.
For precise quantities, see Supplementary Table S1.

[Figure 5, x-axis: aTc Concentration and Temperature (0 ng/ml, 37°C; 15 ng/ml, 24°C; 15 ng/ml, 37°C)]

Figure 5. Mean levels of the target mRNA relative to the non-induced
condition at 37°C obtained from live cell imaging of the RNA tagged
with MS2-GFP under the confocal microscope. Images were taken 1 h after
induction by aTc in three conditions. The error bars are the standard
deviation of the number of RNA molecules in each cell. For precise
quantities, see Supplementary Table S2.



8478 Nucleic Acids Research, 2012, Vol. 40, No. 17 



deviations of the distributions are smaller than the means, 
resulting in a CV² below 1 (Table 1), we can conclude that
this process is sub-Poissonian. This explains the low values 
of cell-to-cell diversity in RNA numbers observed in cell 
populations (Figure 6). 
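The criterion used here can be checked directly from the interval means and standard deviations listed in Table 1. The short Python sketch below recomputes CV² as (std/mean)² for each condition; the dictionary keys are labels chosen for illustration. Because the tabulated means and standard deviations are rounded, the recomputed values need not match the tabulated CV² exactly.

```python
# (mean, standard deviation) of the intervals, in seconds, taken from Table 1
table1_intervals = {
    "A (aTc 0 ng/ml, 37C)": (939.0, 459.0),
    "B (aTc 15 ng/ml, 24C)": (974.0, 676.0),
    "C (aTc 15 ng/ml, 37C)": (617.0, 367.0),
}

def cv_squared(mean, std):
    """Squared coefficient of variation: variance over squared mean."""
    return (std / mean) ** 2

cv2_by_condition = {name: cv_squared(mean, std)
                    for name, (mean, std) in table1_intervals.items()}

for name, value in cv2_by_condition.items():
    # An exponential (Poissonian) interval distribution would give CV^2 = 1.
    print(f"{name}: CV^2 = {value:.2f}, sub-Poissonian: {value < 1}")
```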




Figure 6. Distribution of the fraction of cells with a given number of
mRNA molecules (x-axis: number of target RNA molecules), 1 h following
induction by aTc (when induced), obtained from live cell imaging in
three conditions.



From the comparison of the distributions A and C in
Figure 7, we can assess the effects of induction in the
dynamics of transcription under the control of PtetA.
Note how distribution C, aside from having a smaller
mean, is also slightly more exponential-like, explaining
why the CV² in RNA numbers is closer to 1 in case C
than in case A (Table 1). From the comparison of the
distributions B and C, we can assess the effects of
lowering the temperature in the dynamics of transcription
under the control of PtetA. When the temperature is
reduced from 37°C (distribution C) to 24°C (distribution
B), the mean interval between transcription events
increases, and the shape of the distribution changes
significantly. Namely, distribution B, corresponding to full
induction at 24°C, is more exponential-like than
distribution C.

These results rely on the intensity jump-detection 
method. This method assumes that the target RNA mol- 
ecules are quickly bound by the MS2-GFP tagging 
proteins once transcribed (which was verified in (55)) 
and that, once that occurs, they do not degrade during 
the measurement. To validate this assumption of "immortality",
we studied the kinetics of degradation of these
complexes. The MS2-GFP proteins used here are an 
assembly-defective mutant with the FG loop deleted 
(55,67). A recent study suggested that when using a 
target RNA with 48 binding sites for these MS2-GFP 

Figure 7. Distributions of time intervals between productions of consecutive mRNA molecules in individual cells under the control of PtetA in
three conditions: (A) 0 ng/ml aTc at 37°C, obtained from 504 cells and 180 intervals; (B) 15 ng/ml aTc at 24°C, obtained from 178 cells and 152 intervals;
and (C) 15 ng/ml aTc at 37°C, obtained from 113 cells and 254 intervals. The probability density functions of inferred models of transcription
initiation with differing number of rate-limiting steps are also shown in each plot. Note the different scales in the y-axis.

proteins, the resulting RNA-MS2-GFP complex may
degrade during measurement sessions of a couple of
hours (46). This would interfere, to some extent, with
the intensity jump-detection method (see 'Materials and
Methods' section). However, our target mRNA has 96
binding sites, and therefore the kinetics of degradation
likely differs. We studied the degradation kinetics of the
target RNA when bound by MS2-GFP and observed
~120 RNA-MS2-GFP complexes produced in 50 cells 
for up to 2h. We did not observe any degradation event 
of these complexes nor did we observe any significant loss 
of brightness in individual spots. We conclude that the 
additional MS2-GFP molecules provide sufficient stability 
to the complex, for us to consider them to be immortal for 
all practical purposes, implying that the total brightness of 
the spots within a single cell monotonically increases with 
time. 

We next study the kinetics of the underlying processes 
responsible for shaping the interval distributions 
(Figure 7) and thus the observed cell-to-cell diversity in
RNA numbers (Figure 6). For this, certain assumptions
are necessary. First, we assume that the distributions are
mainly shaped by the kinetics of transcription initiation.
This relies on the following: in all distributions in Figure 7,
the mean interval duration is higher than ~500s. 
Therefore, elongation is not expected to play a significant 
role in shaping the distributions because it lasts only tens 
of seconds (57). Also, as noted, the target RNA, when 
bound by MS2-GFP proteins, does not degrade during 
the measurements. Because of this, we assume that the 
distributions are shaped by transcription initiation, 
which includes steps such as the closed complex forma- 
tion, isomerization and open complex formation 
(8,25-32,35,40,41,68). 

We estimate by maximum likelihood the number and
duration of the most prominent "rate-limiting" steps, that 



Table 1. Number of cells analysed, number of intervals between
productions of consecutive RNA molecules detected in individual cells,
mean interval duration, standard deviation of interval durations and
CV² of the interval durations for each condition

Condition            aTc 0 ng/ml (37°C)   aTc 15 ng/ml (24°C)   aTc 15 ng/ml (37°C)

No. cells            504                  178                   113
No. intervals        180                  152                   254
Interval mean (s)    939                  974                   617
Interval std (s)     459                  676                   367
Interval CV²         0.21                 0.48                  0.35



is, the ones that shape the distributions of intervals
between transcription events (40). The results are shown
in Figure 7 for each condition when assuming one, two
and three steps. In Table 2, we show the log-likelihood
values and the durations of the inferred steps for 1-step,
2-step and 3-step models, for each condition. The number
of steps can be determined by using a likelihood-ratio test
between pairs of models to reject a lower-degree model in
favour of a higher-degree one (55). In Table 3, we show
the results of the likelihood-ratio tests. For all
distributions, the test rejects the 1-step (exponential) model in
favour of the 2-step model {P values < 8.32 x 10~^). For 
distributions A and C, the 2-step model is also rejected in 
favour of the 3-step model (P values < 0.0019). 
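The likelihood-ratio comparison can be sketched in Python from the rounded log-likelihoods of Table 2. This is a minimal illustration under the assumption that each added step contributes one free parameter, so that twice the log-likelihood difference is compared with a chi-square distribution with 1 degree of freedom; the text does not state the degrees of freedom used, and the function names are hypothetical. Because the inputs are rounded, the P values obtained are only approximate and need not reproduce Table 3 exactly.

```python
import math

def chi2_sf_1df(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def lrt_p_value(loglik_null, loglik_alt):
    """Likelihood-ratio test of a d-step null against a (d+1)-step alternative."""
    statistic = 2.0 * (loglik_alt - loglik_null)
    return chi2_sf_1df(max(statistic, 0.0))

# Rounded log-likelihoods from Table 2, indexed by condition and number of steps d
log_likelihoods = {
    "A": {1: -1412, 2: -1369, 3: -1356},
    "B": {1: -1198, 2: -1174, 3: -1171},
    "C": {1: -1886, 2: -1836, 3: -1828},
}

lrt_p_values = {
    condition: {(d, d + 1): lrt_p_value(ll[d], ll[d + 1]) for d in (1, 2)}
    for condition, ll in log_likelihoods.items()
}

for condition, tests in lrt_p_values.items():
    for pair, p in tests.items():
        print(f"case {condition}, models {pair}: P = {p:.3g}")
```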

The time scales of the steps (for d = 2) are identical for 
all cases. As discussed in a previous work (40) this may be 
because of some unknown artefact of the inference 
method or be representative of the real kinetics of transcription
initiation of this promoter. The method of inference
was found to reliably distinguish the duration of each
step when they differ by approximately 25% or more in 
duration, from 200 intervals sampled from a model of 
gene expression (40). For smaller differences, the 
solution can be biased toward inferring steps with identi- 
cal durations, for unknown reasons. Nevertheless, given 
the number of intervals measured, it is possible to 
conclude that the steps do not differ by more than ap- 
proximately 25%. 
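For the special case in which all d steps are constrained to have the same duration, as in the 2-step solutions, the maximum likelihood fit has a closed form: the interval follows an Erlang distribution and the fitted step duration is the mean interval divided by d. The Python sketch below illustrates this simplified case on synthetic data; it is not the inference method of (40), which allows steps of different durations, and the function name, seed and sample size are assumptions made for illustration.

```python
import math
import random

def erlang_mle(intervals, d):
    """Fit d sequential exponential steps constrained to equal duration.

    Returns (step_duration, maximised log-likelihood).
    """
    n = len(intervals)
    total = sum(intervals)
    rate = d * n / total  # MLE of the per-step rate for fixed d
    loglik = (n * d * math.log(rate)
              + (d - 1) * sum(math.log(t) for t in intervals)
              - rate * total
              - n * math.lgamma(d))  # lgamma(d) = log((d - 1)!)
    return 1.0 / rate, loglik

# Synthetic intervals from a 2-step model with 300 s per step (illustrative values)
rng = random.Random(7)
synthetic = [rng.expovariate(1 / 300) + rng.expovariate(1 / 300)
             for _ in range(250)]

fits = {d: erlang_mle(synthetic, d) for d in (1, 2, 3)}
for d, (step_s, loglik) in fits.items():
    print(f"d = {d}: step duration = {step_s:.0f} s, log-likelihood = {loglik:.1f}")
```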

From Tables 2 and 3, provided that the assumed se- 
quential model of transcription initiation (8,25- 
32,35,40,41,68) is correct, we conclude that, when not 
induced with aTc, transcription initiation controlled by 
PtetA has 3 rate-limiting steps, which are similar in
duration (differing by less than 100-150 s between them). 
When fully induced, at 24°C, there are two dominant 
rate-limiting steps, similar in duration. Finally, at 37°C 
under full induction, there are three rate-limiting steps, 
two longer and similar in duration, and a (clearly) 
shorter third step. No significant improvement was 
obtained in the fit with more steps in any of the 
conditions. 
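The 1-step entries of Table 2 can be cross-checked against Table 1. For an exponential model fitted by maximum likelihood to n intervals with sample mean m, the maximised log-likelihood is -n(ln m + 1). The Python sketch below applies this standard formula to the number of intervals in Table 1 and the 1-step durations in Table 2; the function name is an illustrative choice.

```python
import math

def exponential_max_loglik(n_intervals, mean_interval_s):
    """Maximised log-likelihood of an exponential model with rate 1/mean."""
    return -n_intervals * (math.log(mean_interval_s) + 1.0)

# (number of intervals from Table 1, 1-step duration and log-likelihood from Table 2)
one_step_rows = {
    "A": (180, 939.0, -1412),
    "B": (152, 975.0, -1198),
    "C": (254, 617.0, -1886),
}

recomputed = {case: exponential_max_loglik(n, mean)
              for case, (n, mean, _) in one_step_rows.items()}

for case, value in recomputed.items():
    print(f"case {case}: recomputed {value:.1f}, tabulated {one_step_rows[case][2]}")
```

The recomputed values round to the tabulated 1-step log-likelihoods in all three conditions.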

We conclude that there are three rate-limiting steps in 
transcription initiation of PtetA. By lowering the temperature,
two of the steps become longer in duration, whereas
the third step remains unaltered and, because of its now
relatively much shorter duration, it becomes barely detectable
(P value of 0.0268, Table 3). Interestingly, the other
two steps are not significantly affected by temperature 
(compare cases B and C in Table 2). Induction on the 



Table 2. Log-likelihood and durations of the steps of the inferred models with d steps, for each condition

     aTc 0 ng/ml (37°C)               aTc 15 ng/ml (24°C)              aTc 15 ng/ml (37°C)
d    Log-likelihood  Durations (s)    Log-likelihood  Durations (s)    Log-likelihood  Durations (s)

1    -1412           939              -1198           975              -1886           617
2    -1369           (470,470)        -1174           (487,487)        -1836           (309,309)
3    -1356           (313,313,313)    -1171           (620,240,115)    -1828           (254,254,109)


There is no implied temporal order for the steps. 

Table 3. Likelihood-ratio test P values between pairs of models for
each condition

(d0, d1)    aTc 0 ng/ml (37°C)    aTc 15 ng/ml (24°C)    aTc 15 ng/ml (37°C)

(1, 2)      0                     3.02 x 10"'^           0
(2, 3)      7.27 x 10"'           0.027                  1.20 x 10""
(3, 4)      0.342                 0.268                  0.370

The null model is the d0-step model (where d0 is 1, 2 or 3), while the
alternative model is the d1-step model (where d1 = d0 + 1).



Table 4. P values from the Kolmogorov-Smirnov test between the
empirical distribution and the various inferred models with d steps

d    aTc 0 ng/ml (37°C)    aTc 15 ng/ml (24°C)    aTc 15 ng/ml (37°C)

1    4.84 x 10""           4.81 x 10"^            1.27 x 10""
2    1.10 x 10""*          0.725                  0.0485
3    0.0782                0.844                  0.378
4    0.662                 0.842                  0.272




other hand affects the duration of the three steps (compare
cases A and C in Table 2). 
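The link between the inferred step durations and the variability of the intervals can be made explicit. If an interval is the sum of independent exponential steps with mean durations tau_i, its mean is the sum of the tau_i and its variance is the sum of the tau_i squared. The Python sketch below applies these standard identities to the 3-step durations of Table 2; the function name is an illustrative choice.

```python
def sequential_model_moments(step_durations_s):
    """Mean and CV^2 of an interval made of independent exponential steps."""
    mean = sum(step_durations_s)
    variance = sum(tau ** 2 for tau in step_durations_s)
    return mean, variance / mean ** 2

# Durations (s) of the inferred 3-step models from Table 2
three_step_models = {
    "A": (313, 313, 313),
    "B": (620, 240, 115),
    "C": (254, 254, 109),
}

model_moments = {case: sequential_model_moments(steps)
                 for case, steps in three_step_models.items()}

for case, (mean, cv2) in model_moments.items():
    print(f"case {case}: model mean = {mean} s, model CV^2 = {cv2:.2f}")
```

With equal steps the model CV² is 1/d, its minimum for d steps; unequal steps move it toward 1, which is consistent with case B being the most exponential-like.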

It is possible to provide additional support to these 
results, as well as to the assumption that each step 
follows an exponential distribution in duration, using a 
set of logical pairwise Kolmogorov-Smirnov (K-S) tests. 
One can compare the distributions of intervals between 
transcription events, for each condition, between the em- 
pirical cumulative distribution function of each case and 
the corresponding cumulative distribution function of the 
inferred models with d-steps, for all values of d. If the 
model accurately describes the measurements, the empirical
and the inferred distributions should be indistinguish-
able by the K-S test. The comparisons (i.e. the P values) 
are shown in Table 4. Usually, for P values smaller than 
0.01, it is concluded that the two distributions differ 
significantly. 
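The comparison between an empirical distribution and a fitted d-step model can be sketched as follows. This Python example uses synthetic intervals, models with equal step durations (Erlang distributions) and the asymptotic Kolmogorov distribution with a common small-sample correction for the P value; all of these are simplifying assumptions made for illustration and are not the exact procedure used for Table 4. Because the step duration is fitted to the same data, the resulting P values tend to be conservative.

```python
import math
import random

def erlang_cdf(t, d, step_s):
    """CDF of the sum of d exponential steps, each with mean step_s."""
    x = t / step_s
    return 1.0 - math.exp(-x) * sum(x ** m / math.factorial(m) for m in range(d))

def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and the model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d_max = max(d_max, i / n - f, f - (i - 1) / n)
    return d_max

def kolmogorov_p_value(d_max, n):
    """Asymptotic P value of the K-S statistic for an effective sample size n."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d_max
    if lam < 0.3:
        return 1.0
    series = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                 for j in range(1, 101))
    return min(1.0, max(0.0, 2.0 * series))

# Synthetic intervals drawn from a 2-step model (illustrative values only)
rng = random.Random(3)
intervals = [rng.expovariate(1 / 300) + rng.expovariate(1 / 300)
             for _ in range(2000)]
mean_interval = sum(intervals) / len(intervals)

p_by_steps = {}
for d in (1, 2, 3):
    step_s = mean_interval / d
    d_max = ks_statistic(intervals, lambda t, d=d, s=step_s: erlang_cdf(t, d, s))
    p_by_steps[d] = kolmogorov_p_value(d_max, len(intervals))
    print(f"d = {d}: K-S P value = {p_by_steps[d]:.3g}")
```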

From Table 4, in case A (aTc 0 ng/ml, 37°C), the models
with fewer than 3 steps do not accurately match the
measured data. In case B (aTc 15 ng/ml, 24°C), only the
1-step model does not accurately match the data. This, as
noted, is because the lower temperature increases the
duration of two of the steps, rendering the effects of the
third step much less significant in the overall
distribution of intervals. Finally, in case C, we have the
same result as in case A, that is, models with fewer than 3
steps do not accurately match the measured data. These
results support the previous conclusions, using the
likelihood-ratio test, regarding the number of sequential steps
that determine the shape of the distribution of intervals in
each condition.

Finally, we tested whether during our in vivo measure- 
ments, the kinetics of production of RNA changed over 
time because of possible changes in the intracellular con- 
centration of aTc. This could occur if degradation because 
of light sensitivity of intracellular aTc or its slow diffusion 



across the membrane were significant. That is, if either 
effect is significant, it should be possible to distinguish 
between the distributions of intervals obtained in the 
first 30min and those obtained in the last 30min of the 
hour-long measurements. We extracted these two 
sub-distributions from each induction condition and 
compared them with the Kolmogorov-Smirnov test. The 
Kolmogorov-Smirnov test was unable to differentiate the 
two distributions in any condition (all P values > 0.1), 
demonstrating that the measurements were done at an 
approximate steady state. In particular, for case A
(aTc 0 ng/ml, 37°C), the P value was 0.32; for case B
(aTc 15 ng/ml, 24°C), it was 0.16; and for case C (aTc
15 ng/ml, 37°C), it was 0.78.
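The comparison of the two sub-distributions corresponds to a two-sample Kolmogorov-Smirnov test, which can be sketched in Python as follows. The P value uses the asymptotic Kolmogorov distribution with a common small-sample correction, which is an assumption of this sketch rather than a description of the software used here; the function name and the example data are hypothetical.

```python
import math
from bisect import bisect_right

def ks_two_sample(sample_a, sample_b):
    """Two-sample K-S statistic and asymptotic P value."""
    xs, ys = sorted(sample_a), sorted(sample_b)
    n, m = len(xs), len(ys)
    # The largest gap between the two empirical CDFs occurs at a data point.
    d_max = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
                for v in xs + ys)
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d_max
    if lam < 0.3:
        return d_max, 1.0
    series = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                 for j in range(1, 101))
    return d_max, min(1.0, max(0.0, 2.0 * series))

# Hypothetical intervals (s) from the first and last 30 min of a measurement
first_half = [410.0, 520.0, 610.0, 700.0, 880.0, 930.0, 1100.0]
second_half = [390.0, 560.0, 640.0, 720.0, 810.0, 990.0, 1200.0]
statistic, p_value = ks_two_sample(first_half, second_half)
print(f"D = {statistic:.3f}, P = {p_value:.3f}")
```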



DISCUSSION 

Previous studies have shown that the kinetics of transcription
initiation of PtetA is heavily dependent on induction
and on environmental factors such as temperature 
(8,16,35,56). Other studies have shown that RNA produc- 
tion in bacteria is a stochastic (24,69) and a multi-step 
process (30), generally subject to complex regulatory 
mechanisms (8). Finally, recent studies have shown that 
the cell-to-cell diversity in RNA numbers, and likely 
protein numbers, can be significantly affected by processes 
other than gene expression (47,48,51). Therefore, the as- 
sessment of the kinetics of gene expression and regulation 
requires in vivo measurements of RNA production 
dynamics in individual cells, one event at a time, under 
various induction and environmental conditions (40). 

The measured in vivo distributions of intervals between 
consecutive productions of RNA molecules in single 
cells are found to be sub-Poissonian, for the induction 
levels and temperatures tested. This was also observed in 
the case of Plar (40). The sub-Poissonian nature of the
kinetics explains the low cell-to-cell diversity in RNA
numbers observed at the cell population level. Relevantly,
the distributions of intervals, including their mean and
variance, are found to respond readily and discernibly to
induction as well as temperature, revealing the plasticity of
the expression mechanism of TetA. The plasticity appears 
to arise from the diversity of the changes in the durations 
of the various steps in response to differing induction 
levels and temperature. 

Our results assume that the measurements are only 
affected by intrinsic noise sources in transcription. 
Downstream events, such as translation or RNA degrad- 
ation, do not affect the results, as we detect RNA mol- 
ecules as soon as these are produced and study only the 
time intervals between these events. It is necessary to
discuss whether other noise sources, aside from intrinsic noise
in transcription, could affect diversity in number of produced
RNA molecules per cell. It may be that differences in
the amounts of TetR and/or aTc in each cell could
contribute to this diversity (i.e. be a significant source of
extrinsic noise (70)). First, the strain used here
over-expresses TetR (DH5αPRO produces constitutively
around 7000 dimeric Tet repressors per cell during
logarithmic growth (8)); thus, we expect the contribution of
diversity in TetR numbers to the diversity in RNA
numbers to be negligible. As for aTc, we found no 
evidence, at least at the population level, of varying 
rates of transcription initiation because of varying concen- 
trations of aTc over time (e.g. because of depletion), as we 
found no difference in the kinetics of RNA production 
during the first and second half of the measurement 
period. Additionally, the tetA gene, responsible for
expressing TetA, which confers tetracycline resistance
(11) (by pumping it out of the cell), is not present in our
strain. Finally, diversity in the numbers of either aTc or
TetR in the cells is expected to only increase diversity, not
decrease it (and thus cannot be the explanation for the
observed sub-Poissonian kinetics). Nevertheless, it is
stressed that our measurements are not sufficient to
predict the cell-to-cell diversity in number of RNA
molecules under the control of PtetA, as other factors, such as
RNA degradation (45) and RNA partitioning in cell
division (47), need to be considered.

In this regard, measurements by fluorescent in situ hy- 
bridization of RNA numbers under the control of 20 
E. coli promoters (42,43) exhibited high values of cell- 
to-cell diversity. Another work reported Fano factors of 
mRNA numbers for 137 highly expressed genes in E. coli, 
ranging from 1 to 3, using a yellow fluorescent protein 
fusion library for E. coli (44). Although these results 
could be because of super-Poissonian transcript produc- 
tion from the strongly expressing promoters studied, other 
explanations cannot be ruled out. For example, it may be 
that although the production is sub-Poissonian, the 
observed diversity is a result of the subsequent contribu- 
tion from complex RNA degradation mechanisms (45) 
and imperfect partitioning of RNA molecules in cell 
division (47,48,51), among other things such as the
biased segregation of unwanted substances (e.g. the
fluorescent molecules used to tag the RNA) (49,52). In any
case, we note that our measured distributions of intervals
between transcription events cannot be explained by
models such as the "on-off" model of transcription
initiation (43,71), as this model entails super-Poissonian
kinetics. 
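The contrast with the "on-off" model can be illustrated with a small stochastic simulation. In the Python sketch below, a promoter switches between an OFF and an ON state and produces RNA at a constant rate only while ON; the rate constants, the seed and the function name are arbitrary values chosen for illustration and are not estimates for any promoter discussed here. The intervals between productions obtained in this way have a CV² above 1.

```python
import random
import statistics

def on_off_intervals(k_on, k_off, k_prod, n_intervals, rng):
    """Simulate intervals between productions in a two-state promoter model.

    k_on: OFF -> ON rate, k_off: ON -> OFF rate, k_prod: production rate
    while ON (all in 1/s).
    """
    t, last_production, is_on = 0.0, None, False
    intervals = []
    while len(intervals) < n_intervals:
        if is_on:
            total_rate = k_off + k_prod
            t += rng.expovariate(total_rate)
            if rng.random() < k_prod / total_rate:
                # A production event occurred while the promoter was ON.
                if last_production is not None:
                    intervals.append(t - last_production)
                last_production = t
            else:
                is_on = False
        else:
            t += rng.expovariate(k_on)
            is_on = True
    return intervals

rng = random.Random(11)
simulated = on_off_intervals(k_on=1 / 1000, k_off=1 / 100, k_prod=1 / 20,
                             n_intervals=5000, rng=rng)
on_off_cv2 = statistics.pvariance(simulated) / statistics.mean(simulated) ** 2
print(f"CV^2 of intervals in the on-off model: {on_off_cv2:.2f}")
```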

The kinetics of RNA production under the control of 
PtetA can be explained by a model of transcription
initiation with successive "rate-limiting" steps, each of which
is exponentially distributed in duration. From the infer- 
ence of the number and duration of these steps in several 
conditions, we found that induction with aTc significantly 
changes the RNA production kinetics by reducing the 
duration of all rate-limiting steps, to various degrees. In 
particular, one of the steps becomes almost indiscernible. 
Meanwhile, lowering temperature under full induction by 
aTc increases the duration of two of the steps, but not of 
the third step, causing the kinetics to be well-fit by a 
two-step model under these conditions. Note that it is 
not possible to determine which steps [e.g. closed 
complex, open complex or isomerization (28)] are 
affected by aTc and temperature. Novel experimental 
techniques are necessary to perform this study in vivo. 
However, we can rule out TetR dissociation from the 
promoter as one of the rate-limiting steps. The complex 
TetR-promoter has a half-life of 12 s (37), which is a much 



faster process than the measured rate-limiting steps. 
Additionally, although TetR, when bound by aTc, may 
retain some ability to bind to the DNA [although this
ability is reduced by approximately 9 orders of magnitude
(72)], there is no evidence that this complex would have a 
longer half-life than when the DNA is bound by TetR 
alone. 

A recent work used the methods used here to analyse
the kinetics of Plar (40). The measurements were made at
24°C. It is of interest to compare them with our
measurements regarding the response to induction. First, the
kinetics of RNA production of Plar is also
sub-Poissonian. Also, induction of Plar with IPTG and
arabinose reduces the duration of the rate-limiting steps. The
main differences between these two promoters are in the
mean duration of the intervals at 24°C [for Plar, this mean
duration is ~1500 s (40), whereas for PtetA, it is ~1000 s]
and in the variability of the intervals. For Plar, the CV² of
the durations of these intervals is 0.70, whereas for PtetA, it
is 0.52. From this, we conclude that the kinetics of
transcription initiation of PtetA is less noisy than that of Plar. It
is worthwhile to mention that the observations of the
behaviour of PtetA (a native promoter) suggest that the
sub-Poissonian kinetics of RNA production is not, for
example, an artefact of the synthetic nature of Plar, and
that it may be a common feature of the dynamics of
transcript production in E. coli. Also, the results support
several previous observations on the effect of temperature
on the kinetics of transcription, but further show that the
changes in kinetics are due, in part, to the alteration of the
mean duration of the intermediate steps in transcription
initiation.

PtetA controls the expression of TetA, which is responsible
for the active efflux of tetracycline-Mg2+ complexes.
This protein's function justifies the need for such a
stringent regulatory mechanism so as to ensure that TetA is
present in the appropriate amount because both
tetracycline and TetA (in high amounts) are harmful to the cell
(12). We find that this control is achieved not only by the
negative feedback mechanism of the tet operon (12) but
also by a sub-Poissonian kinetics of transcription
initiation. Relevantly, although robust (less noisy than a
Poisson process), this system is nevertheless sensitive to
external stimuli, such as tetracycline and temperature,
because its behaviour discernibly changes with
temperature and inducer concentration.

In the future, it will be of interest to further analyse how
the dynamics of PtetA differs in other environmental 
conditions, such as in differing concentrations of 
hydrogen ions and metabolites. Such studies may 
provide insights on the plasticity of the kinetics of gene 
expression in bacteria and thus guide the engineering of 
synthetic genetic circuits with specific behavioural 
patterns. 